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We  preaents^an  iterative  method  for  solving  large  sparse  nonsymmetric  linear  systems  of  equations  that 
enhances  Manteoffel's  adaptive  Chebyshev  method  with  a  conjugate  gradient-like  method.  The  new 
method  replaces  the  modified  power  method  for  computing  needed  eigenvalue  estimates  with  Arnoldi’s 
method,  which  can  be  used  to  simultaneously  compute  eigenvalues  and  to  improve  the  approximate 
solution.  Convergence  analysis  and  numerical  experiments  suggest  that  the  method  is  more  efficient 
than  the  original  adaptive  Chebyshev  algorithm. 
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1.  Introduction 

The  adaptive  Chebyshev  algorithm  of  Manteuffel  [11,  13}  is  an  iterative  method  for  solving  large 
sparse  real  nonaym metric  systems  of  linear  equations  of  the  form 

Ax  —  b,  (1) 

where  the  coefficient  matrix  A  has  positive-definite  symmetric  part.  Starting  from  an  initial  guess,  xfl) 
the  method  generates  a  sequence  of  iterates  {x-}  whose  residuals  {r  -«  b  -  Ax.}  satisfy 

rj  -  P/A)r0,  (2) 

where 

pi<*<  -  /  T(f)  ■  <3> 

T.  is  the  j’th  Chebyshev  polynomial  of  the  first  kind 
Tj(i) «—  cosh(j  cosh‘‘(t)), 

and  c  and  d  are  iteration  parameters  that  depend  on  the  convex  hull  of  the  spectrum  of  A.  Two 
properties  of  the  Chebyshev  polynomials  make  this  algorithm  effective.  First,  for  an  appropriate  choice 
of  the  iteration  parameters,  the  residual  polynomials  P-(A)  decrease  rapidly  in  norm,  so  that  the 
algorithm  is  rapidly  convergent  [13].  Second,  the  three-term  recurrence  for  Chebyshev 
polynomials  induces  an  inexpensive  recurrence  for  the  computation  of  each  iterate  Xj. 

Because  the  iteration  parameters  depend  on  the  convex  hull  of  the  spectrum  of  A,  estimates  of  the 
extreme  eigenvalues  of  A  are  needed.  Manteuffel’s  algorithm  computes  these  estimates  dynamically  [ll|. 
It  starts  with  a  (possibly  arbitrary)  guess  for  the  required  parameters  and  monitors  the  convergence  of 
the  iterates  generated.  If  convergence  is  deemed  unsatisfactory,  then  information  produced  during  the 
iteration  is  used  to  compute  eigenvalue  estimates.  These,  in  turn,  are  used  to  compute  new  iteration 
parameters,  and  the  Chebyshev  iteration  is  restarted  with  the  new  parameters.  This  adaptive  procedure 
is  repeated  until  good  iteration  parameters  are  found,  after  which  the  Chebyshev  method  can  proceed 
with  no  further  adaptive  steps. 


The  eigenvalue  computation  makes  use  of  the  residuals  generated  by  the  previous  Chebyshev  Iteration. 
The  underlying  numerical  method  is  a  modified  version  of  the  power  method.  If  the  values  of  the 
iteration  parameters  used  by  the  Chebyshev  iteration  are  inaccurate,  then  the  residuals  generated  may 
diverge.  Although  divergent  residuals  may  enhance  the  ability  of  the  adaptive  procedure  to  obtain 
accurate  eigenvalue  estimates  and  iteration  parameters,  the  residual  norms  may  increase  by  several 
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orders  of  magnitude  before  good  iteration  parameters  are  found  [5,  7].  Thus,  the  adaptive  Chebyshev 
method  may  do  a  considerable  amount  of  work  to  compute  iteration  parameters  before  it  makes  any 
improvement  in  the  accuracy  of  the  approximate  solution  of  the  linear  system. 

In  this  paper,  we  present  an  alternative  to  the  eigenvalue  computation  part  of  the  Manteuffel 
algorithm  that  decreases  the  sensitivity  of  the  Chebyshev  method  to  iteration  parameters.  We  replace 
the  modified  power  method  for  computing  eigenvalues  with  Arnoldi's  method  [I,  18],  a  generalisation  of 
the  Lancsos  method  (10]  that  estimates  the  eigenvalues  of  a  nonsymmetric  matrix  A  by  reducing  it  to 
upper-Hessenberg  form.  An  advantage  of  this  method  comes  from  its  relationship  to  conjugate  gradient¬ 
like  iterative  methods  for  solving  nonsymmetric  linear  systems  [5,  17,  20].  At  relatively  little  extra 
expense,  information  provided  by  Arnoldi's  method  can  be  used  to  perform  several  steps  of  an  iterative 
method  that  improves  the  quality  of  the  solution  iterate  prior  to  restarting  the  Chebyshev  iteration  with 
new  parameters.  The  hybrid  method  combines  the  basic  Chebyshev  method  with  this  coiyugate 
gradient-like  iteration,  which  is  performed  whenever  new  eigenvalue  estimates  are  computed. 

In  Section  2,  we  briefly  describe  the  original  adaptive  Chebyshev  method.  In  Section  3,  we  describe 
Arnoldi’s  method  and  its  relationship  to  conjugate  gradient-like  iterative  methods  for  nonsymmetric 
linear  systems,  and  we  present  a  convergence  result  for  one  of  these  iterative  methods.  In  Section  4,  we 
define  the  hybrid  method  and  discuss  its  advantages,  and  in  Section  5,  we  present  the  results  of  some 
numerical  experiments  comparing  the  performances  of  the  hybrid  method,  the  adaptive  Chebyshev 
method  and  the  CG-like  method  Orthomin  (4,  S,  23,  25]  in  solving  some  disc  retired  non-self-adjoint 
elliptic  partial  differential  equations. 
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S.  The  Adaptive  Chebyshev  Method 

la  this  section,  we  give  a  brief  overview  of  Manteuffel’s  adaptive  Chebyshev  method.  For  given 
iteration  parameters  c  and  d,  the  basic  Chebyshev  iteration  is  [13]: 

Algorithm  It  The  Chebyshev  method. 

I.  Start:  Choose  an  initial  guess  Xq,  compote  rQ  —  b  -  AXq  and  p0  -■  ^  rQ. 
g.  Iterate:  FOR  j— 0  STEP  1  UNTIL  convergence  DO: 


xj+l 

-Xj  +  Pj 

rj+l 

-b-Axj+, 

(2d/(2d2 

-ch 

°j+t 

"  tjd  - 

V. 

-  daj+l  - 1 

*V1 

~  °j+lfj+l + 

The  cost  is  one  matrix-vector  product  pins  2N  multiplications  per  step.  The  storage  required  is  4N 
words  for  Xj,  Axj,  r,  and  pj.  The  residuals  {r^ }  satisfy  (2)  and  (3),  and  Pj  is  a  member  of 

Pj  am  {real  polynomials  of  degree  j  such  that  Pj(0)— 1}.  (4) 

The  parameters  c  and  d  define  the  center,  d,  and  foci,  d±e,  of  a  family  of  confocal  ellipses  in  the 
complex  plane.  There  is  a  smallest  member  of  this  family,  the  smallest  ellipse,  that  contains  the 
spectrum  of  A.  If  the  closure  of  the  smallest  ellipse  does  not  contain  the  origin,  then  Algorithm 
1  converges.  Moreover,  convergence  is  nearly  optimum  in  the  sense  that  as  j  increases,  Pj  rapidly 
approaches  the  polynomial  in  P  with  minimum  uniform  norm  over  the  smallest  ellipse. 


If  the  spectrum  of  A  lies  in  the  right  half  plane,  then  there  is  an  infinite  number  of  smallest  ellipses, 
each  of  which  uniquely  corresponds  to  a  set  of  Chebyshev  iteration  parameters.  For  any  particular 
choice  of  parameters,  the  rate  of  convergence  of  the  Chebyshev  iteration  is  [13,  22,  24] 


(5) 

(®) 

t  The  iteration  count  for  convergence  is  (approximately)  inversely  proportional  to  the  reciprocal  of  the  rate 

of  convergence.  Hence,  the  test  ellipse  is  defined  to  be  that  smallest  ellipse  for  which  the  rate  of 
convergence  is  greatest.  The  adaptive  Chebyshev  method  starts  with  (possibly  arbitrary)  initial  vat-ms 


“  log  (max  S(A)), 


where 


d-«+Kd-s)*-ea],/* 
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for  the  iteration  parameters  and  monitors  the  convergence  of  the  Chebyshev  iteration  (Algorithm  1).  If 
convergence  is  deemed  unsatisfactory  (i.e.  the  residuals  are  diverging  or  converging  less  rapidly  than  (5) 
suggests)  after  step  a,  then  the  adaptive  procedure 

1.  estimates  eigenvalues  on  the  eonvex  hull  of  the  spectrum  of  A  (11),  and 

2.  computes  the  iteration  parameters  for  the  best  ellipse  containing  these  eigenvalue 
estimates  (13). 

The  Chebyshev  iteration  is  then  restarted  with  the  new  parameters.  The  adaptive  procedure  is  repeated 
as  often  as  is  deemed  necessary,  until  good  parameters  are  found,  after  which  the  Chebyshev  iteration  is 
performed  until  convergence. 

The  second  step  of  the  adaptive  procedure,  the  computation  of  iteration  parameters,  requires  negligible 
machine  resources,  and  we  omit  a  discussion  of  it  here. 

The  eigenvalue  estimates  are  computed  by  a  modified  power  method,  which  is  based  on  the  fact  that, 
asymptotically, 

Pj(«)~S  (*)», 

so  that 

«j  «  S(A)»r0 , 

where  S(A)  is  the  linear  operator  induced  by  S(i).  That  is,  the  residuals  resemble  the  vectors  generated 
by  the  power  method  for  S(A).  If,  for  given  iteration  parameters,  some  eigenvalue  of  S(A)  has  modulus 
greater  than  one  and  rQ  has  a  component  in  the  corresponding  eigenvector,  then  the  residuals  will  diverge 
bat  will  eventually  become  rich  in  that  eigenvector.  If  there  are  m  such  eigenvalues,  then  eventually  the 
sequence  of  m+1  residuals 

will  be  nearly  linearly  dependent.  Estimates  for  m  eigenvalues  of  S(A)  are  then  given  by  tbe  roots  of  the 
m’th-degree  polynomial 

*1  +  V  +  *"  +  °m,nFt  +  *“  ' 

whose  coefficients  are  the  solution  to  the  least  squares  problem 

IIV">W|]“  +  rs+Bls  *  (7) 

where  (rg,...^’t+m_i)  denotes  the  matrix  with  columns  (11).  Estimates  for  eigenvalues  of  A  can 

be  computed  from  the  relationship 
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*-S(X) 

between  eigenvalues  (ji)  of  S(A)  and  {X}  of  A. 

Hence,  the  modified  power  method  consists  of  m  Chebyshev  steps  to  generate  the  residuals  {r}?^+1, 
followed  by  the  computation  of  the  least  squares  solution  to  (7),  and  the  computation  of  new  eigenvalue 
estimates  and  iteration  parameters.  The  Chebyshev  steps  require  m  matrix- vector  products  and  2mN 
multiplications.  If  (7)  is  solved  using  the  normal  equations,  then  |(m2+3m)/2]N  multiplications  are 
needed  to  compute  the  inner  products 

(rs+j'ri+k)»  0<j<m-l,  0<k<m.  (8) 

Therefore,  the  dominant  cost  is  m  matrix-vector  products  plus  ((m2-f  7m)/2]N  additional  multiplications. 
The  storage  requirement  (over  that  of  the  Chebyshev  iteration)  is  mN  words  to  save  the  vectors 

Note  that  an  “unmodified”  power  method  could  be  used  instead  of  the  modified  power  method  by 
replacing  with  { A^r^} in  (7)  [11].  We  will  examine  a  technique  that  is  mathematically 

equivalent  to  the  unmodified  power  method  in  Section  5. 


3.  Arnold i's  Method  and  Its  Relation  to  Iterative  Linear  Solvers 
In  this  section,  we  describe  Arnoldi’s  method  for  computing  eigenvalues  of  nonsymmetric  matrices, 
show  how  it  can  be  used  as  the  basis  for  iterative  methods  for  solving  linear  systems,  and  derive  a 
convergence  bound  for  one  of  these  linear  solvers. 

Given  an  arbitrary  vector  v,  such  that  |jvj||2  —  1,  Arnoldi’s  method  [1,  18]  is  a  Galerkin  method  on 

the  Krylov  subspace  Km  -■  span{vJ,Av1 . Am'1v1]  for  approximating  the  eigenvalues  of  A.  That  is,  it 

finds  a  set  of  eigenvalue  estimates  {Xj,...,Xm}  such  that  there  exist  nontero  u.  €  Km,  i—l,...,m,  for  which 

(Au.  -  X.u.,v)  -  0,  i  -  l,...,m  (9) 

for  all  v  6  Km.  It  accomplishes  this  by  constructing  an  orthonormal  matrix  Vm  »  [v,,... ,vm]  whose 
columns  {▼jljUj  span  Km,  and  then  computing  the  eigenvalues  of  V^AVm. 
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Algorithm  St  Arnoldi’s  method. 

1.  Start:  Choose  an  initial  rector  vt  such  that  Hr|(|2«l,  and  a  step  number  m. 


2.  Iterate:  FOR  j-1  STEP  1  UNTIL  m  DO 

hjj  “  (AVj,Vj),  i— l|.«j 

Vi-ATi'SVi 

hj+i,i  “  Hvj+iHj 
Tj+i  “  Tj-»Vhi+i.i  • 


Notice  that  this  method  is  essentially  a  Gram-Schmidt  process  for  orthonormalising  the  Krylov 
sequence  {V|IAv1,...,Am*1V|}.  In  a  practical  implementation,  it  is  usually  more  suitable  to  use  a  modified 
Gram-Schmidt  process.  The  orthonormal  matrix  V_  is  such  that  V^AV  a  H  ,  where  H  is  the  mxm 
upper- Hessenberg  matrix  whose  (i j)  entry  is  the  scalar  h-.  The  method  generalises  the  symmetric 
Lancsos  algorithm  to  nonsymmetric  matrices.  Recall  that  in  the  symmetric  case,  Hffl  is  symmetric  and 
tridiagonal  [16], 

In  an  implementation,  it  is  not  necessary  to  compute  the  normalised  vectors  { } ;  it  suffices  to 
compute  and  save  the  norms  {||v.||2}.  It  is  also  not  necessary  to  compute  vm+r  With  these  conventions, 
the  cost  of  Arnoldi’s  method  is  m  matrix-vector  products  and  (m2+m)N  multiplications.  The  storage 
requirement  is  (m+l)N  words  for  w»d  Av  {20].  * 

Suppose  now  that  xfl  is  a  guess  to  the  solution  of  (1),  with  residual  rQ  b  -  AxQ.  Let  Km  — 

span{r0,Ar0,...,Anl'1r0}.  One  way  to  solve  (1)  iteratively  is  to  compute  an  approximate  solution  xm  £  Xq 

+  K  such  that  the  Galerkin  condition 
xo 

(rm»T)  -  0.  ▼  €  Km  (10) 

holds.  But  if  v(  —  rg/IJr^Jlj,  then  the  Arnoldi  vectors  {v.JJ^j  span  Km,  so  that 

xm  ~  x0  +  V.  - 

rm  •  ro  -  a*b  “  Msti  -  AV»y«  > 

for  some  yB  €  Rm.  Since  the  Arnoldi  vectors  are  orthonormal, (10)  is  imposed  by  computing 

-  "2W*. 

where  e^  is  the  j’th  unit  vector  in  Rra.  Hence,  the  algorithm  [17]:  t 
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Algorithm  Si  The  full  orthogonali  ration  method  (FOM). 

i.  Start:  Choose  an  initial  guess  Xq,  compute  rQ  —  b  -  Ax 0  and  Tj—r^/UrQUj. 

t.  Iterate:  Perform  m  steps  of  Algorithm  2  starting  with  Tj. 

S.  Form  the  eolation: 

s®1”  H»y»  “  IMaei  * 

compute  xm-x0  +  Vmjm  , 
where  V_  and  H  are  determined  by  Arnoldi’s  method. 

m  m 


Algorithm  3  is  also  referred  to  as  Amoldi’s  method  for  solving  linear  systems.  It  is  theoretically 
equivalent  to  the  ORTHORES  method  developed  by  Young  and  Jea  [25],  which  is  modelled  after  a 
version  of  the  conjugate  gradient  method  described  by  Engeli  et.  al.  [8]. 


A  drawback  of  Algorithm  3  is  that  the  approximate  solution  xm  does  not  satisfy  an  optimality 
property.  An  alternative  is  the  generalised  minimal  residual  method  (GMRES)  developed  by  Saad  and 
Schults  [20],  which  uses  the  Amoldi  basis  to  compute  the  point  xm  6  xQ  +  Km  whose  residual  norm 
|b  -  Axm||2  is  minimum.  Let  v,  —  r0/Hr0H2>  let  P  ”  ||r0||2,  and  let  Hm‘denote  the  (m+l)xm  matrix 
obtained  by  appending  to  Hm  a  row  with  single  nonsero  entry  hm+1  m  in  column  m.  Then  the  Arnoldi 
basis  matrices  Vm  and  Vm+,  -  [v, . v-+1]  satisfy 


AVm  “  Vn+1Hn. 


(11) 


The  GMRES  iterate  is  given  by  x0  +  t,  where  s  is  the  solution  of  the  least  squares  problem 


min  JJb  -  A(x0+s)|J2  —  min  ||r„  -  A«H2  —  min  ||0v,  -  AVmy||2  .  (12) 

i€K— 

Using  (11)  and  the  fact  that  Vm+1  is  orthonormal,  the  last  expression  in  (12)  is  equal  to 

-  Vm+l»myll2  “  1*1  “  •  (13) 

Hence,  the  GMRES  iterate  is  given  by  x^  +  Vmym,  where  ym  is  the  solution  to  the  upper-Hessenberg 
least  squares  problem  on  the  right  hand  side  of  (13). 

Algorithm  4s  The  generalised  minimal  residual  method. 

I.  Start:  Choose  an  initial  guess  x^  compute  r#  —  b  -  Ax,,  and  Vj—r^flrJj,  set  £  —  ||r0||2. 
t  Iterate:  Perform  m  steps  of  Algorithm  2  starting  with  vt. 

S.  Form  the  eolation:  Find  ym  mimimiting  |/!tei-H_y|L  and  compute  x_  —  x*  +  V  v  ,  where 
m  and  are  determined  by  Arnoldi  s  method. 


GMRES  is  a  generalisation  of  the  MINRES  algorithm  presented  by  Paige  and  Saunders  [15].  It  is 
mathematically  equivalent  to  Young  and  Jen’s  ORTHODIR  [25],  for  arbitrary  nonsingular  matrices  A. 
For  matrices  with  positive-definite  symmetric  part,  it  is  also  equivalent  to  the  generalised  conjugate 
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residual  method  [4,  5]  and  a  method  of  Axelsson  [2].  For  large  step  numbers,  it  requires  one  third  the 
multiplications  and  one  half  the  storage  of  these  methods  [20]. 


For  both  FOM  and  GMRES,  once  and  Hm  are  computed,  the  dominant  cost  of  computing  xm 

is  mN  multiplications.  Hence,  the  cost  of  both  methods  is  m  matrix-vector  products  plus  (tn2+2m)N 
multiplications.  In  addition  to  storage  for  Xj,  (m+l)N  words  are  needed  for  the  Arnoldi  computation. 
We  remark  that  for  both  methods,  the  residual  norm  ||b  -  Axm||2  can  be  monitored  during  Step  2 
without  explicitly  computing  xm,  so  that  Step  2  can  be  stopped  as  soon  as  the  approximate  solution  is 
sufficiently  accurate  [20]. 


An  error  analysis  of  GMRES  can  be  found  in  [20].  We  derive  a  new  result  here  that  will  demonstrate 

its  effectiveness  in  the  hybrid  method.  Note  that  the  residual  r  ■»  b  -  Ax  satisfies 

mm 

Kh  -  ltPm(AM2  > 

where  Pm  is  defined  by  (4).  Assume  that  A  is  diagonalizable, 

A  -  IMIT1,  (14) 


where  A  is  the  diagonal  matrix  of  eigenvalues  {^j}|Li  sn<*  U  ™  [0j,...,UjJ  is  the  matrix  of  eigenvectors  of 
A.  Note  that  U  and  A  may  be  complex.  Suppose  that  the  initial  residual  is  dominated  by  m 
eigenvectors,  i.e. 


■ 

where  ||e|L  is  small  in  comparison  to  ||E  cru-L,  and  that,  moreover  the  sum  in  (15)  satisfies 
•  j*l  J  I  • 

m 

if  some  complex  u.  appears  in  £  o-u-,  then  its  conjugate  u.  appears  also. 

«  j.j  l  i  * 


(15) 


(10) 


(In  general,  this  might  require  including  small  components  in  the  sum,  with  a  corresponding  increase  in 

m.) 


Theorem  Ii  If  A  is  diagonalisable  and  the  initial  residual  satisfies  (15)  -  (10),  then  the 
residual  norm  after  m  steps  of  GMRES  satisfies 


Mt  *  I'VJj  '»  M, . 

■  |Xl*X"  ■ 

where  c  «■  max  II  -v  • 

B  k>a  J-ll  Aj  I 

Proof!  Let  Um  —  [ul,...,mj,  Aa  —  diag(Xl,...,Xm),  and  «  —  («l,.-,om)T,  so  that  (15)  is 
equivalent  to 


r0  ~  Ub*  +  e- 


2 
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Consider  the  polynomial 
—  ■  ►X; 

which  satisfies  Pm(Xp  —  0,  1  <j  <m,  Pm(0)  —  1.  Hence 

fJA)us-VA>-°* 

so  that 

rj Air,  -  P„(A)U„.  +  P„(A)«  -  P„(A)e. 
Moreover,  by  10,  Pmhas  real  coefficients  so  that 

Kh  “  ™  l|Pffl(A)roll2  <  l|Pm(A)e||2 
<  IIUIIjlllT'llj  l|Pm(dn)||2  I|e||2 . 

The  assertion  then  follows  with 


c«  ”  l|Pm(^m)ll2  “  lpm(Xk)i- 


Q.E.D. 


Note  that  the  constant  cm  does  depend  on  and  it  may  not  be  small  if,  for  example,  these 

eigenvalues  are  small  relative  to  the  others.  However,  if  {Xj}jLi  m  ^e  m  dominant  eigenvalues  of  A 
(i.e.  |Xk>V)  for  j<m,  k>m),  then 

|Xk-Xj|  <  2|Xj| 
for  k  >  m,  so  that 
cm  C  2m  . 

ID 

Moreover,  if  {Xp™_t  are  large  relative  to  the  remaining  eigenvalues,  then  typically 

\\-\\  <  |Xj|. 

In  this  case,  cn  will  be  of  order  one,  and  the  m  steps  of  GMRES  reduce  the  residual  norm  to  the  order  of 
|eg3  provided  that  the  condition  number  of  U  is  not  too  large. 


4.  The  Hybrid  Method 

The  hybrid  method  combines  the  approaches  of  the  previous  two  sections.  It  uses  the  basic  Chebyshev 
iteration  of  Algorithm  1,  but  replaces  the  modified  power  method  for  computing  eigenvalues  with 
Araoldi's  method,  from  which  information  is  also  used  to  improve  the  solution  iterate.  Either  FOM  or 


.si-ae-’ 
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GMRES  could  be  used  for  the  solution  update;  we  favor  GMRES  because  of  its  minimization  property. 
In  the  following  implementation,  the  convergence  of  the  Chebyshev  iteration  is  monitored  by  examining 
the  norms  of  the  generated  residuals,  and  the  adaptive  procedure  is  invoked  if  the  residual  norm  exceeds 
a  specified  tolerance  r  relative  to  the  norm  of  rmifl  a  rmio(c,d),  the  smallest  residual  encountered  with 
the  current  iteration  parameters.  In  addition,  the  adaptive  procedure  is  invoked  periodically,  after  at 
most  s  Chebyshev  steps,  and  it  is  used  to  generate  initial  eigenvalue  estimates  from  which  initial 
iteration  parameters  are  obtained.1 

Algorithm  5:  The  hybrid  method. 

Choose  x0.  Compute  rfl  «■  b  -  Ax^. 

UNTIL  Convergence  DO 

Adaptive  Steps:  Set  Vj  —  the  current  normalized  residual,  perform  m 
Amoldi/GMRES  steps  (Algorithm  4),  and  use  the  new  eigenvalue 
estimates  to  update  (or  initialize)  the  iteration  parameters. 

Chebyshev  Steps :  Set  jmaI“j+9- 

WHILE  <|r,|,/|r<|J,<  r»d  j+1  <jm„) 

Compute  x-+1  by  the  Chebyshev  iteration. 

The  Chebyshev  step  requires  one  matrix-vector  product  and  2N  multiplications  per  iteration,  and  the 
adaptive  step  requires  m  matrix-vector  products  and  (m2+2m)N  multiplications.  As  with  the  modified 
power  method,  the  eigenvalue  estimates  provided  by  Arnoldi’s  method  lie  in  the  field  of  values  of  A  but 
not  necessarily  in  the  convex  hull  of  the  spectrum  of  A,  so  that  the  hybrid  method  is  only  rigorously 
applicable  to  linear  systems  with  positive-definite  symmetric  part.  The  storage  requirement  for  the 
adaptive  step  is  mN  words,  the  same  as  iot  the  modified  power  method,  since  the  first  Arnoldi  vector 
can  share  storage  with  the  residual  of  the  Chebyshev  iteration. 

There  are  two  main  differences  between  the  original  adaptive  Chebyshev  method  and  the  hybrid 
method: 

1.  Different  eigenvalue  computations:  the  adaptive  Chebyshev  method  uses  the  modified 
power  method  based  on  the  operator  S(A),  whereas  the  hybrid  method  uses  Arnoldi’s 
method,  which  is  based  on  a  Krylov  subspace  in  A. 

2.  Purification:  the  hybrid  method  uses  the  GMRES  steps  to  improve  the  approximate 


^anteuffel'*  method  [13|  for  computing  iteration  parameter*  from  eigenvalue  estimate*  i*  still  used. 


11 


solution. 

A  third  difference  is  that  in  the  hybrid  method,  the  initial  eigenvalue  estimates  provided  by  Arnoldi’s 
method  can  be  used  to  compute  initial  iteration  parameters;  the  original  Chebyshev  method  requires  an 
initial  guess. 

We  do  not  know  whether  the  use  of  Arnoldi’s  method  alone  offers  any  advantage,  i.e.  whether 
Arnoldi’s  method  provides  more  accurate  eigenvalue  estimates  than  the  modified  power  method. 
Arnoldi’s  method  is  mathematically  equivalent  to  the  “unmodified”  power  method  discussed  by 
Manteuffel  [11],  who  observed  no  significant  difference  between  the  unmodified  and  modified  methods. 
Numerical  experiments  comparing  the  two  techniques  are  described  in  Section  5. 


The  effect  of  the  GMRES  steps  can  be  explained  by  a  heuristic  analysis  based  on  Theorem  1.  Assume 
that  A  is  diagonalitable  as  in  (14).  If  the  initial  residual  for  the  hybrid  method  has  the  form 

N 

rft  —  E  Tfju.  , 


J  l 


then  after  s  Chebyshev  iterations,  the  residual  is  approximately  equal  to  [13] 


£  fjTjUj  , 
J*l  •  1  1 


(18) 


where 


s(xp 


d-V  +  [(d-X^  -  c2]1/2 
d  +[d2-c*]1/*  ’ 


and  c,  d  are  the  iteration  parameters  used  in  the  Chebyshev  step.  Suppose  that  these  parameters  are 
inaccurate,  so  that  the  components  in  the  directions  of  some  eigenvectors  are  not  being  damped  out. 
This  meaiw  that  some  of  the  (r)  satisfy  |Tj|>l,  so  that  |»?|  >  1  and  the  terms  with  these  coefficients 
dominate  (18).  Note  that  |r|  —  |F.|,  so  that  if  some  complex  eigenvector  is  not  being  damped  out,  then 
neither  is  its  conjugate.  For  some  m,  therefore,  r  satisfies  (15)  (with  er  —  r?7.)  and  (18).  If  the 
corresponding  eigenvalues  (X.)  of  A  are  the  dominant  ones,  then  Theorem  1  suggests  that  the  m  GMRES 
steps  purify  the  residual  of  the  eigenvectors  whose  coefficients  had  been  growing  during  tbe  Chebyshev 
iteration.  Moreover,  since  F  is  the  starting  vector  for  the  Arnoldi  computation  and  is  presumably  rich  in 
these  eigenvectors,  the  new  eigenvalue  estimates  will  be  good  approximations  to  the  corresponding 
eigenvalues.  Hence,  the  new  iteration  parameters  will  produce  Chebyshev  polynomials  that  continue  to 
damp  out  these  components. 


Although  the  correct  value  of  m  to  use  in  the  adaptive  step  is  not  known  in  general,  this  analysis  still 
shows  that  m  GMRES  steps  will  tend  to  damp  out  the  m  dominant  components  of  (18).  The  analysis 
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applies  as  well  even  if  the  iteration  parameters  are  accurate  but  not  optimal,  i.e.  the  Cbebyahev  iteration 
is  damping  out  all  components  but  better  parameters  exist.  In  this  case,  some  components  will  not  be 
damped  out  as  rapidly  as  others  during  the  Chebyshev  step,  and  these  will  eventually  be  dominant  in 
(18). 


Since  the  purification  step  seems  to  provide  the  important  advantage  of  the  hybrid  method,  it  is 
natural  to  ask  whether  a  similar  idea  can  be  implemented  with  the  modified  power  method,  which  uses 
{r^  «=<  S(Af+Jr0}j^  ,  to  compute  eigenvalue  estimates.  One  such  procedure  consists  of  computing 


*  e  x.+m  +  8Pan(r . Wt)  0°) 

for  which  ||r]|2  =*  ||b  -  Ax||2  is  minimum.  This  requires  the  solution  of  the  least  squares  problem 

min  !ir.+m  ~  )|’ajAr.+j!l2  '  (2°) 

To  solve  (20)  using  the  normal  equations,  it  is  necessary  to  compute  the  inner  products 

(Ar.+j'Ar.+k)>  0  <  i  <  k  <  (2i) 

(r.+m’Ar.+j)’  0  <  j  <  m-1.  (22) 

Note  that  the  recurrence  for  the  Chebyshev  iteration  induces  a  three-term  residual  recurrence 


Ari--i 


!+/»; 


'-rM+  _  .  _ 

J  J  J 


ri  ~  irri+> 


Therefore,  except  when  j— 0,  all  the  quantities  of  (21)  can  be  computed  in  terms  of 


(23) 


(f.+f'.+u)’  %mm  j’1dd+1»  o  —  k-l,k,k+l, 

which  are  available  from  the  modified  power  method  (see  (8)  above).  Similarly,  except  when  j««0  and 
j»m-l,  the  terms  of  (22)  are  available  from  the  modified  power  method.  Moreover,  the  same  trick  can 
be  used  for  j-»0  in  (21)  if  r^j  is  saved  and  ((r^J^i+k)}“iJ  j  are  computed;  and  for  j-*0  and  j-»m-l  in 
(22)  if  (r^j,rt)  and  (rg+m>ri+m)  a**  computed.  Hence  (20)  can  be  solved  with  a  total  of  m+3  inner 
products.  The  computation  of  x  requires  an  additional  mN  multiplications,  so  that  purification  can  be 
added  to  the  modified  power  method  with  (2m+3)N  multiplications.  Combining  this  with  the 
((m2+7m)/2]N  multiplications  and  m  matrix-vector  products  required  for  the  modified  power  method, 
the  cost  of  this  adaptive  procedure  is  m  matrix-vector  products  plus  |(Bi3+llm+6)/2|N  multiplications. 
This  contrasts  with  m  matrix-vector  products  and  (m3-f2m)N  multiplications  for  the  hybrid  method. 
Thus,  the  number  of  matrix-vector  products  is  the  same  as  for  the  hybrid  method,  but  the  number  of 
additional  operation*  is  different.  The  coefficient  of  N  for  the  additional  operation  counts  of  both 
methods,  for  several  values  of  m,  is  shown  in  Table  4-1.  The  storage  requirement  is  (m+l)N  words,  for 
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{rt+j}jLl  and  r^j,  which  is  N  greater  than  for  the  hybrid  method.2 


♦ - 

1 

n  =  2 

4 

6 

8 

10 

1 

Hybrid 

8 

24 

48 

80 

120 

1  Modified 

♦  — - —  — 

Power  with  Purification 

16 

33 

54 

70 

108 

Table  4-1:  Coefficient  of  N  in  multiplication  count  of  purification  adaptive  steps. 

Finally,  we  note  that  similar  methods  for  annihilating  eigencomponents  have  been  developed  in  slightly 
different  contexts  by  Saad  and  Sameh  [IS]  and  by  Jes  person  and  B uning  [10]. 


5.  Numerical  Experiments 

In  this  section,  we  describe  the  results  of  numerical  experiments  in  which  the  methods  discussed  above 
are  used  to  solve  some  no nsym metric  linear  systems  arising  from  the  discretisation  of  non-self-adjoint 
elliptic  boundary  value  problems.  We  examine  four  methods  based  on  four  choices  for  the  adaptive 
procedure: 

(A)  CHEB:  the  modified  power  method  with  no  purification; 

(B)  HYBRID:  Arnoldi’s  method  with  purification  by  GMRES; 

(C)  CHEB-MIN:  the  modified  power  method  with  purification  added  by  solving  (20); 

(D)  CHEB-ARNOLDI:  Arnoldi’s  method  without  purification. 

The  experiments  were  run  on  a  VAXl  1-780  in  double  precision  (55  bit  mantissa).  The  Chebyshev 
iterations  were  based  on  a  slightly  modified  version  of  Manteuffel's  Chebyshev  code  [12].  The 
eigenvalues  of  the  upper- Hessenberg  matrix  Hm  generated  by  Arnoldi’s  method  were  computed  using 
EISPACK  [21]. 

Table  5-1  summarises  the  work  and  storage  requirements  for  the  adaptive  procedures  of  each  of  the 
four  methods.  The  matrix-vector  products  are  denoted  by  Av. 


As  in  Algorithm  5,  the  adaptive  procedure  of  each  method  is  invoked  if 


*Note  that  the  ipse*  is  (It)  do**  not  contain  th*  most  recent  information  available  ,  since  is  excluded.  We  exclude  it 
to  avoid  th*  computation  of  Ara  in  (20).  Alio,  a  lea*  expensive  purification,  with  no  reference  to  rM,  could  b*  performed  if  r( 
war*  excluded.  The  given  method  i*  a  compromise  between  thee*  two  alternative*. 
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1 

CHEB 

1 

HYBRID 

1  CHEB-HIN  I 

CHEB-ARNOLDI 

Work 

1 

a  Av  v 

1 

a  Av  ♦ 

1  a  Av  «  1 

a  Av  ♦ 

1 

(a2*7a)N/2 

1 

(b2*2b)N 

|(a24lli»46)N/2  1 

(»2*b)N 

Storage 

1 

aN 

1 

aN 

1  (■♦1)N  1 

aN 

Table  6-1:  Work  and  storage  requirements  for  the  adaptive  procedures. 

(24) 

where  rmin  is  the  smallest  residua)  encountered  for  the  current  parameters,  and  r*— 2.  For  HYBRID  and 
CHEB-MIN,  it  is  also  invoked  after  at  most  a— 20  Chebyshev  steps  so  that  the  purincation  step  is 
performed  periodically.  Since  no  purification  occurs  in  CHEB  and  CHEB-ARNOLDI,  these  techniques 
allow  the  Chebyshev  iteration  to  proceed  if  the  convergence  seems  to  agree  with  the  predicted  rate  of 
convergence.3  HYBRID  and  CHEB-ARNOLDI  compute  initial  values  for  the  iteration  parameters  c  and  d 
from  eigenvalue  estimates  provided  by  Arnoidi’s  method  applied  to  the  initial  residual.  CHEB  and 
CHEB-MIN  use  c—*0  and  d*l  as  the  initial  iteration  parameters.  Following  [11],  we  use  m«*4  as  the 
si se  of  the  Arnold:  and  modified  power  bases  in  an  effort  to  identify  the  dominant  and  subdominant 
complex  eigenvalue  pair.  Table  5-2  contains  the  work  and  storage  costs  of  the  adaptive  procedures  for 
this  value  of  m. 

♦ - - - 4- - > - — ♦— - - - ♦- - - - - 4 

I  CHEfl  I  HYBRID  I  CHEB-MIN  I  CHEB-ARNOLDI  I 

♦- - - — -♦ — — — - ♦ — — - ♦ - - - — - - ♦ 

I  Work  |  4  A*  ♦  22N  I  4  Av  ♦  24N  I  4  A*  ♦  33N  |  4  Av  ♦  20N  I 

♦ - • - ♦ — - — — — - ♦ — — — - — - — ♦ — - - - ♦ 

I  Storage  |  4N  I  4N  I  SN  |  4N  | 

4 - - - -4 - - - 4 - 4 - - 4 - - - 4 

Table  6-Si  Costs  of  the  adaptive  procedures,  bH. 

For  the  test  problem,  we  use  the  elliptic  partial  differential  equation 

-  (e^yly  +  Kt(x+y)n,  +  ((x+y)uy  +  [l/(l+x+y)|u  -  f,  (25) 

where  fk*  real  scalar  parameter  and  the  right  hand  side  f  is  chosen  so  that  the  solution  is 

*lf  )  is  Um  inde*  of  Um  (list  Cbebvshev  iterate  eorreepooding  to  tbo  current  iteration  parameters,  then  asymptotically 
l*.4tl>jlt  “  bounded  by  mas  S(|X|p  for  X  C  *(A)  (13).  The  heuristic,  built  into  the  original  cods  [12],  is  to  compute  new 
parameters  only  if  Ir^J^/lr,!,  >  2  8(d)* 


u(x,y)  —  x  e*7  sin(xx)  sin(xy). 

We  pose  (25)  on  (he  unit  square  {0<xjr<  1}  with  homogeneous  Dirichlet  boundary  conditions  and 
discretize  using  the  fire-point  second  order  centered  finite  difference  scheme  on  a  uniform  47i47  grid, 
producing  a  linear  system 

Ax  -  b  (26) 

of  order  N  — »  2200.  We  use  the  values  7—5  and  7—50.  In  addition,  we  precondition  (26)  with 
incomplete  factorizations.  We  use  both  the  incomplete  LU  (ILU)  and  modified  incomplete  LU  (M1LU) 
factorizations  with  no  extra  fill-in  (see  (3,  5,  0,  14]  for  the  details  concerning  these  techniques).  The 
actual  linear  systems  on  which  the  various  iterative  methods  are  tested  have  the  form 

Ax  -  [AQ-1]  [Ox]  -  b  -  b, 

where  Q  is  the  preconditioning  matrix.  We  thus  have  four  test  problems: 

•  Problem  1:  7—5,  ILU  preconditioning 

•  Problem  2:  7—5,  MILU  preconditioning 

•  Problem  3:  7—50,  ILU  preconditioning 

•  Problem  4:  7—50,  MILU  preconditioning. 

For  all  tests,  the  initial  guess  is  xQ— 0  and  the  stopping  criterion  is  Br^g/Hr^  <  10  ®. 

Table  5-3  shows  the  number  of  iterations  required  to  satisfy  the  stopping  criterion,  where  an  iteration 
for  the  four  adaptive  Chebyshev  methods  is  defined  to  be  either  a  Chebyshev  step  or  an  Arnoldi  step. 
Thus,  one  iteration  does  not  correspond  to  a  fixed  amount  of  work,  although  each  iteration  contains  one 
matrix-vector  product. 

♦ - ♦- - ♦ - ♦ - - 

I  CHEB  I  HYBRID  I  CHEB-  I  CHEB-  I  ORTHO- 1 

|  I  I  MIN  | ARNOLDI I  NIN(1)I 


Probleu  1  1 

00 

1 

60 

1 

64 

1 

77 

1 

78 

Problen  2  1 

35 

1 

27 

1 

34 

1 

44 

1 

32 

Problsa  3  1 

36 

1 

42 

1 

31 

1 

69 

1 

82 

Problen  4  I 

33 

1 

27 

1 

31 

1 

34 

1 

21 

Table  S»Si  Iterations  to  convergence. 


Figures  5.1  -  5.4  show  the  performance  of  the  methods  on  each  of  the  four  problems.  The  coordinates 
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an  residual  norm  |jr- 1|2  (on  a  logarithmic  scale)  vs.  multiplications.  As  a  benchmark,  for  each  problem 
we  also  include  the  performance  of  the  conjugate  gradient-like  method  Orthomin(l)  [4,  5,  23,  25].  Note 
that  numerical  experiments  indicating  that  Chebyshev  methods  (as  well  as  Orthomin)  an  more  effective 
than  the  conjugate  gradient  method  applied  to  the  normal  equations  and  the  biconjugate  gradient 
method  an  presented  in  (5,  0,  7]. 

In  examining  this  data,  we  consider  three  main  issues: 

1.  the  effect  of  the  purification  steps  in  HYBRID  and  CHEB-MIN; 

2.  the  effect  of  the  diffennt  eigenvalue  estimators:  Arnoldi’s  method  in  HYBRID  and  CHEB- 
ARNOLDI  vs.  the  modified  power  method  in  CHEB  and  CHEB-MIN; 

3.  the  diffennt  choice  of  initial  parameters:  an  initial  Arnoldi  computation  in  HYBRID  and 
CHEB-ARNOLDI  vs.  initial  guesses  of  d— 1,  c«0  in  CHEB  and  CHEB-MIN. 

The  first  issue  is  cleanut:  for  all  four  problems,  the  method  with  purification  is  superior  to  its 
analogue  without  purification.  This  is  explained  bj  the  analysis  of  Section  4:  if  the  residuals  from  the 
Chebyshev  steps  an  diverging,  then  the  purification  essentially  annihilates  the  eigenvector  components 
that  an  growing,  at  nlatively  little  extra  cost. 

A  direct  comparison  between  the  two  techniques  for  estimating  eigenvalues  is  somewhat  difficult 
because  of  the  diffennt  roles  of  the  growth  tolerance  parameter  r.  In  the  modified  power  method,  four 
Chebyshev  iterations  an  performed  after  the  condition  (24)  is  violated,  so  that  the  residuals  will  become 
very  rich  in  the  needed  eigenvectors.  In  contrast,  Arnoldi’s  method  is  performed  as  soon  as  (24)  is 
violated,  so  that  the  residuals  will  probably  not  be  dominated  as  much  by  these  eigenvectors.  Without 
purification,  Arnoldi’s  method  (in  CHEB-ARNOLDI)  does  not  seem  as  effective  as  the  modified  power 
method  (in  CHEB).  However,  the  combined  Arnoldi/GMRES  step  of  HYBRID  appears  to  be  more 
effective  than  the  purified  modified  power  step  of  CHEB-MIN.  It  is  both  less  expensive  (for  m— >4),  and 
it  strongly  limits  the  growth  of  the  nsidual. 

For  the  third  issue,  note  that  inaccurate  initial  iteration  parameters  cause  the  residuals  generated  by 
CHEB  and  CHEB-MIN  to  diverge  by  several  orders  of  magnitude  in  Problems  1,  2  and  4  (the  missing 
eigenvalues  take  some  time  to  assert  themselves  in  Problem  1).  This  difficulty  is  avoided  by  HYBRID  in 
Problems  1  and  2,  where  fairly  accurate  initial  eigenvalue  estimates  combine  with  the  strict  growth 
tolerance  r»»2  to  prevent  divergence.  HYBRID  does  not  handle  Problem  4  as  well.  This  is  because  the 
initial  Arnoldi  estimates  determine  a  domain  of  convergence  for  the  Chebjrshev  iteration  that  just  misses 
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one  eigenvalue,  and  the  next  Chebyshev  iteration  diverges  too  slowly  for  the  sdsptive  procedure  to  be 
invoked  until  the  maximum  number  of  20  steps  is  performed.  In  Problem  3,  the  eigenvalues  are 
clustered  near  1  so  that  the  initial  parameters  for  CHEB  and  CHEB-MIN  are  accurate,  whereas  Arnoldi’s 
method  has  some  difficulty  identifying  them.  The  use  of  Arnoldi’s  method  for  initial  eigenvalue 
estimates  tends  to  make  the  overall  performance  somewhat  smoother,  although  it  may  not  be  necessary 
if  good  initial  parameters  are  available. 

Finally,  note  that  the  performances  of  Orthomin(l)  and  the  Chebyshev  methods  are  very  close.  The 
slopes  of  the  Chebyshev  curves  are  steeper,  reflecting  their  lower  cost  per  step  [4,  5,  13],  but  the  overhead 
of  the  adaptive  steps  increases  their  total  cost. 
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